Vowel Recognition of Patients after Total Laryngectomy using Mel Frequency Cepstral Coefficients and Mouth Contour
Abstract
This paper addresses the recognition of isolated vowels spoken by patients after total laryngectomy. The visual and acoustic speech modalities were fed separately into machine learning algorithms. Mel Frequency Cepstral Coefficients served as acoustic descriptors of the speech signal, and the lip contour was extracted from video of the speakers' faces using the OpenCV software library. Three types of classifiers were compared for vowel recognition: Artificial Neural Networks, Support Vector Machines and Naive Bayes. Support Vector Machines gave the highest recognition rate. For a group of laryngectomees with differing speech quality, the authors achieved recognition rates of 75% for the acoustic modality and 40% for the visual modality. This is a higher recognition rate than in previous research in which 10 cross-sectional areas of the vocal tract were estimated. With the presented image processing algorithm, the visual features can be extracted automatically from a video signal.
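As a rough illustration of the acoustic descriptors mentioned in the abstract, the sketch below computes Mel Frequency Cepstral Coefficients for a single frame using only the Python standard library. The abstract does not state the frame length, number of mel filters, number of coefficients or sampling rate, so the values used here (256 samples, 20 filters, 13 coefficients, 8000 Hz) and all function names are assumptions chosen for illustration, not the authors' implementation.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return its one-sided power spectrum.

    Uses a naive O(n^2) DFT to stay dependency-free; an FFT would be
    used in practice.
    """
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale, 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling slope, peak of 1.0 at mid
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return n_coeffs cepstral coefficients for one frame.

    Steps: power spectrum -> mel filterbank energies -> log -> DCT-II
    (unnormalised).
    """
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies so that log() never sees zero.
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]


# Example: one 256-sample frame of a 440 Hz tone sampled at 8000 Hz
# (synthetic input, not data from the paper).
frame = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(256)]
coeffs = mfcc(frame, 8000)
```

In a recognition setup like the one described, a vector such as `coeffs` would be computed per frame and passed to a classifier; the classifier stage is not sketched here.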
Similar resources
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Pronunciation recognition of English phonemes /ə/, /æ/, /ɑː/ and /ʌ/ using Formants and Mel Frequency Cepstral Coefficients
The Vocal Joystick Vowel Corpus, by Washington University, was used to study monophthongs pronounced by native English speakers. The objective of this study was to quantitatively measure the extent to which speech recognition methods can distinguish between similar sounding vowels. In particular, the phonemes /ə/, /æ/, /ɑː/ and /ʌ/ were analysed. 748 sound files from the corpus were used and su...
Recognition of Tamil Syllables Using Vowel Onset Points with Production, Perception Based Features
Tamil is one of the ancient Dravidian languages spoken in south India. Most Indian languages are syllabic in nature, and syllables take the form of Consonant-Vowel (CV) units. In the Tamil language, the CV pattern occurs at the beginning, middle and end of a word. In this work, CV units formed with Stop Consonant – Short Vowel (SCSV) were considered for the classification task. The work ca...
Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition
We’ve examined the speaker discriminative power of mel-, antimel- and linear-frequency cepstral coefficients (MFCCs, aMFCCs and LFCCs) in the nasal, vowel, and non-nasal consonant speech regions. Our inspiration came from the work of Lu and Dang in 2007, who showed that filterbank energies at some frequencies mainly outside the telephone bandwidth possess more speaker discriminative power due to ...
MFCC and Prosodic Feature Extraction Techniques:
In this paper our main aim is to describe the difference between cepstral and non-cepstral feature extraction techniques. We try to cover most of the comparative features of Mel Frequency Cepstral Coefficients and prosodic features. In speaker recognition, two types of techniques are available for feature extraction: short-term features, i.e. Mel Frequency Cepstral Coefficient (MFCC)...
Publication date: 2010